Extracting information-carrying observed variables with log-linear models based on samples of the observed and latent variables from a deep generative model

In this example of a post-hoc approach, we use samples from trained scVI models to infer the learned connection between observed and latent variables.

Loading of packages

First we load some necessary packages and functions.

Loading of the gene expression data used for training

We need these data to transfer the labels of real observations to the samples, based on the patterns in the observed variables that the log-linear models have learned.

VAE

Importing samples generated by scVI

Discretization of samples for observed and latent variables
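As an illustration of this step, the following is a minimal sketch that binarizes each variable at its own median. The thresholding rule, the function name `binarize_by_median`, and the randomly generated stand-in arrays are all assumptions made for this sketch; the discretization actually used in this example may differ.

```python
import numpy as np


def binarize_by_median(samples: np.ndarray) -> np.ndarray:
    """Binarize each column: 1 if a value exceeds that column's median, else 0.

    NOTE: median thresholding is an assumed rule for illustration only.
    """
    medians = np.median(samples, axis=0, keepdims=True)
    return (samples > medians).astype(np.int8)


# Hypothetical stand-ins for the samples exported from a trained model.
rng = np.random.default_rng(0)
latent_samples = rng.normal(size=(1000, 10))           # samples x latent dimensions
observed_samples = rng.poisson(2.0, size=(1000, 50))   # samples x genes (counts)

latent_binary = binarize_by_median(latent_samples)
observed_binary = binarize_by_median(observed_samples)
```

Binary variables keep the contingency tables behind a log-linear model small, which is one reason a coarse discretization like this can be convenient.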

Inspecting the expression of selected genes

Having selected the essential genes, we can now inspect how they are expressed in different cell types. Imagine, for example, that we do not have cell-type labels. In this example we do have them, but they serve purely as ground truth to check how well the log-linear approach extracts the essential genes. In our example, we were able to extract the genes listed below and are now interested in exploring their expression profiles further.

First, we extract the expression values of these genes, then log-transform and standardize the data.
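A minimal sketch of the transformation, assuming a `log(1 + x)` transform followed by per-gene standardization to zero mean and unit variance. The pseudocount of 1, the function name `log_standardize`, and the simulated count matrix are assumptions for illustration, not necessarily the exact choices made here.

```python
import numpy as np


def log_standardize(counts: np.ndarray) -> np.ndarray:
    """Apply log(1 + x), then standardize each gene (column) to mean 0 and std 1."""
    logged = np.log1p(counts)
    mean = logged.mean(axis=0, keepdims=True)
    std = logged.std(axis=0, keepdims=True)
    std[std == 0] = 1.0  # leave constant genes at 0 instead of dividing by zero
    return (logged - mean) / std


# Hypothetical count matrix: 200 cells x 4 selected genes.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(200, 4)).astype(float)

expr = log_standardize(counts)
```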

We can then see how the genes are expressed within the cells given the ground truth labels.
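One simple way to summarize this is the mean expression of each selected gene per cell type. The sketch below assumes the standardized expression is held in a pandas `DataFrame` with one column per gene and the labels in an aligned `Series`; the gene names, label names, and values are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical standardized expression: 6 cells x 3 selected genes.
rng = np.random.default_rng(0)
genes = ["gene_a", "gene_b", "gene_c"]
expr_df = pd.DataFrame(rng.normal(size=(6, 3)), columns=genes)

# Hypothetical ground-truth labels, aligned with the rows of expr_df.
cell_types = pd.Series(
    ["type_1", "type_1", "type_2", "type_2", "type_3", "type_3"],
    name="cell_type",
)

# Mean expression of every gene within each cell type.
mean_by_type = expr_df.groupby(cell_types).mean()
print(mean_by_type)
```

A table like this could then be shown as a heatmap to see which genes separate which cell types.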

However, in our imagined scenario we would not have the cell-type labels. Instead, we would inspect how the selected genes are expressed across the individual cells. This can then be a starting point for further analyses.

For a performance comparison, we now also extract essential genes with LDVAE.

LDVAE

The same steps as presented for the VAE are shown here as well, except for the in-depth exploration of the extracted genes.